[3.9] gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508...
author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Mon, 22 May 2023 10:42:37 +0000 (03:42 -0700)
committer: Raspbian forward porter <root@raspbian.org>
Sat, 24 Jan 2026 09:41:14 +0000 (09:41 +0000)
commit: 144f0f935dd1f3fa59bf8218b0879f28db6c957b
tree: fad5bbcb4dba8bda8335e389ccdaf6c24460ecba
parent: fc6e28f71f3527de7b97f66cf370c674b43b1420
[3.9] gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508) (GH-104575) (GH-104592) (#104593)

gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-102508)

`urllib.parse.urlsplit` already partially follows the WHATWG URL spec (see GH-25595).

This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).
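
For illustration only, the stripping rule can be sketched outside the standard library as follows. The constant `C0_CONTROL_OR_SPACE` and the helper `strip_leading_c0_or_space` are hypothetical names invented for this example and are not taken from the patch; the sketch assumes "C0 control or space" means the code points U+0000 through U+0020, and it strips leading characters only.

```python
from urllib.parse import urlsplit

# Assumption: C0 controls are U+0000..U+001F; adding U+0020 (space)
# gives the full set of characters to strip.
C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))


def strip_leading_c0_or_space(url: str) -> str:
    """Hypothetical helper: drop leading C0 control and space characters."""
    return url.lstrip(C0_CONTROL_OR_SPACE)


# A URL hidden behind a leading space and a control character.
raw = " \x01https://example.com/path?q=1"
cleaned = strip_leading_c0_or_space(raw)
parts = urlsplit(cleaned)

print(repr(cleaned))
print(parts.scheme, parts.netloc, parts.path, parts.query)
```

Without such stripping, a URL prefixed with blanks may fail to match a scheme or host blocklist applied to the parsed result, which is the kind of bypass the CVE describes.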

I simplified the docs by eliding the explanatory state-of-the-world
paragraph in this security-release-only backport (people will see
that in the mainline /3/ docs).

(cherry picked from commit 2f630e1ce18ad2e07428296532a68b11dc66ad10)
(cherry picked from commit 610cc0ab1b760b2abaac92bd256b96191c46b941)
(cherry picked from commit f48a96a28012d28ae37a2f4587a780a5eb779946)

Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
Gbp-Pq: Name 0013-3.9-gh-102153-Start-stripping-C0-control-and-space-c.patch
Lib/test/test_urlparse.py
Lib/urllib/parse.py